BBW: a batch balance wrapper for training deep neural networks on extremely imbalanced datasets with few minority samples

Authors

Abstract

In recent years, Deep Neural Networks (DNNs) have achieved excellent performance on many tasks, but it is very difficult to train good models from imbalanced datasets. Creating balanced batches, either by down-sampling the majority data or by up-sampling the minority data, can solve the problem in certain cases; however, it may lead to instability of the learning process and to overfitting. In this paper, we propose the Batch Balance Wrapper (BBW), a novel framework that adapts a general DNN so that it can be trained well on extremely imbalanced datasets with few minority samples. In BBW, two extra network layers are added at the start of the DNN. These layers prevent overfitting to the minority samples and improve the expressiveness of the minority sample distribution. Furthermore, Batch Balance (BB), a class-based sampling algorithm, is proposed to ensure that each batch is always balanced during the learning process. We test BBW on three well-known extremely imbalanced datasets, where the maximum imbalance ratio reaches 1167:1 with only 16 positive samples. Compared with existing approaches, BBW achieves better classification performance. In addition, BBW-wrapped DNNs train 16.39 times faster relative to unwrapped DNNs. Moreover, BBW does not require data preprocessing or additional hyper-parameter tuning, operations that would otherwise add processing time. The experiments show that BBW can be applied to common applications with extremely imbalanced datasets and few minority samples, such as EEG signals, medical images, and so on.
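The class-based sampling idea is straightforward to sketch. Below is a minimal, illustrative implementation of balanced batch construction; it is an assumption of how BB-style sampling might look, not the paper's actual code, and the function name and signature are invented:

```python
import numpy as np

def batch_balance_sampler(labels, batch_size, rng=None):
    """Yield index batches with an equal number of samples per class,
    drawing minority indices with replacement (illustrative sketch,
    not the BBW paper's API)."""
    rng = rng or np.random.default_rng()
    classes = np.unique(labels)
    per_class = batch_size // len(classes)
    # Group sample indices by class label.
    pools = {c: np.flatnonzero(labels == c) for c in classes}
    while True:
        batch = np.concatenate([
            # Drawing with replacement keeps every batch balanced
            # regardless of the imbalance ratio.
            rng.choice(pools[c], size=per_class, replace=True)
            for c in classes
        ])
        rng.shuffle(batch)
        yield batch

# Usage: a 1167:1 dataset still yields 50/50 batches.
labels = np.array([0] * 1167 + [1] * 16)
batches = batch_balance_sampler(labels, batch_size=32)
idx = next(batches)  # 16 majority + 16 minority indices
```

Because the minority indices are drawn with replacement, each batch stays balanced even at a 1167:1 ratio, at the cost of repeating the 16 positive samples frequently; that repetition is exactly the overfitting risk that the two extra BBW layers are said to mitigate.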


Similar resources

AdaBatch: Adaptive Batch Sizes for Training Deep Neural Networks

Training deep neural networks with Stochastic Gradient Descent, or its variants, requires careful choice of both learning rate and batch size. While smaller batch sizes generally converge in fewer training epochs, larger batch sizes offer more parallelism and hence better computational efficiency. We have developed a new training approach that, rather than statically choosing a single batch siz...
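To make the trade-off concrete, here is a hedged sketch of an adaptive batch-size schedule; the growth factor, interval, and cap are assumptions for illustration, not the schedule published in the AdaBatch paper:

```python
def adaptive_batch_size(epoch, base=32, growth=2, every=20, cap=4096):
    # Start small so early epochs converge quickly, then enlarge the
    # batch to exploit parallelism; all constants here are assumed.
    return min(base * growth ** (epoch // every), cap)

# e.g. epochs 0-19 -> 32, 20-39 -> 64, 40-59 -> 128, ...
```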


A Scalable Near-Memory Architecture for Training Deep Neural Networks on Large In-Memory Datasets

Most investigations into near-memory hardware accelerators for deep neural networks have primarily focused on inference, while the potential of accelerating training has received relatively little attention so far. Based on an in-depth analysis of the key computational patterns in state-of-the-art gradient-based training methods, we propose an efficient near-memory acceleration engine called NT...


Training neural network classifiers for medical decision making: The effects of imbalanced datasets on classification performance

This study investigates the effect of class imbalance in training data when developing neural network classifiers for computer-aided medical diagnosis. The investigation is performed in the presence of other characteristics that are typical among medical data, namely small training sample size, large number of features, and correlations between features. Two methods of neural network training a...


Neural Voice Cloning with a Few Samples

Voice cloning is a highly desired feature for personalized speech interfaces. Neural network based speech synthesis has been shown to generate high quality speech for a large number of speakers. In this paper, we introduce a neural voice cloning system that takes a few audio samples as input. We study two approaches: speaker adaptation and speaker encoding. Speaker adaptation is based on fine-t...


Batch Kalman Normalization: Towards Training Deep Neural Networks with Micro-Batches

As an indispensable component, Batch Normalization (BN) has successfully improved the training of deep neural networks (DNNs) with mini-batches by normalizing the distribution of the internal representation for each hidden layer. However, the effectiveness of BN diminishes in the micro-batch scenario (e.g., fewer than 10 samples in a mini-batch), since the estimated statistics in a mini-bat...
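The failure mode described here is easy to see numerically. The following is an illustrative NumPy sketch (not the paper's Kalman-based estimator) showing how the per-batch mean that BN relies on becomes noisy as the batch shrinks:

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic activations with true mean 1.0 and std 2.0.
activations = rng.normal(loc=1.0, scale=2.0, size=100_000)

for batch_size in (256, 10, 2):
    # Per-batch means, as BN would estimate them one batch at a time.
    usable = (len(activations) // batch_size) * batch_size
    means = activations[:usable].reshape(-1, batch_size).mean(axis=1)
    # With micro-batches the estimated mean swings wildly around the
    # true value (1.0), which is what degrades BN's normalization.
    print(batch_size, means.std())  # ~0.125, ~0.63, ~1.41
```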


Journal

Journal title: Applied Intelligence

Year: 2021

ISSN: 0924-669X, 1573-7497

DOI: https://doi.org/10.1007/s10489-021-02623-9